Overview

Dataset statistics

Number of variables: 13
Number of observations: 3817
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 361.7 KiB
Average record size in memory: 97.0 B

Variable types

Numeric: 9
Categorical: 3
Boolean: 1

Warnings

isLastSmallPeriod has constant value "True" (Constant)
startDate has a high cardinality: 1217 distinct values (High cardinality)
endDate has a high cardinality: 1447 distinct values (High cardinality)
minDate has a high cardinality: 1026 distinct values (High cardinality)
df_index has unique values (Unique)
s_bias has unique values (Unique)
m_bias has unique values (Unique)
l_bias has unique values (Unique)
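
The warnings listed above can be thought of as simple per-column checks on the number of distinct values. The following is a rough sketch of such checks in plain Python, not pandas-profiling's actual implementation: the function name `column_warnings`, the dict-of-lists input, and the `cardinality_threshold` parameter are all assumptions made for illustration, and the toy data is made up.

```python
def column_warnings(columns, cardinality_threshold=50):
    """Return (column, warning) pairs for a dict mapping column name -> list of values.

    cardinality_threshold is a hypothetical cut-off chosen for this sketch.
    """
    found = []
    for name, values in columns.items():
        distinct = len(set(values))
        if distinct == 1:
            # Every row holds the same value.
            found.append((name, "Constant"))
        elif distinct == len(values):
            # No value is repeated.
            found.append((name, "Unique"))
        elif all(isinstance(v, str) for v in values) and distinct > cardinality_threshold:
            # Text-like column with many distinct values.
            found.append((name, "High cardinality"))
    return found


# Tiny made-up columns; the threshold is lowered so the toy data can trigger it.
toy = {
    "flag": [True, True, True, True],
    "row_id": [1, 4, 12, 15],
    "day": ["2011-05-20", "2011-07-29", "2013-03-06", "2011-05-20"],
    "days": [18, 9, 18, 40],
}
print(column_warnings(toy, cardinality_threshold=2))
```

In this sketch a numeric column with repeated values (such as `days`) triggers no warning, which mirrors how the numeric columns with fewer distinct values than rows are not flagged in the list above.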

Reproduction

Analysis started: 2021-01-10 15:13:24.467959
Analysis finished: 2021-01-10 15:13:32.591240
Duration: 8.12 seconds
Software version: pandas-profiling v2.10.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 3817
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10680.87006
Minimum: 1
Maximum: 21001
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1088.4
Q1: 5433
Median: 10739
Q3: 16075
95th percentile: 19994.4
Maximum: 21001
Range: 21000
Interquartile range (IQR): 10642

Descriptive statistics

Standard deviation: 6089.162892
Coefficient of variation (CV): 0.5700998945
Kurtosis: -1.203120547
Mean: 10680.87006
Median Absolute Deviation (MAD): 5325
Skewness: -0.03841756531
Sum: 40768881
Variance: 37077904.72
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
2049                  1      < 0.1%
19460                 1      < 0.1%
17730                 1      < 0.1%
3395                  1      < 0.1%
18696                 1      < 0.1%
1354                  1      < 0.1%
3403                  1      < 0.1%
7501                  1      < 0.1%
1358                  1      < 0.1%
7505                  1      < 0.1%
Other values (3807)   3807   99.7%

Minimum 5 values

Value                 Count  Frequency (%)
1                     1      < 0.1%
4                     1      < 0.1%
12                    1      < 0.1%
15                    1      < 0.1%
21                    1      < 0.1%

Maximum 5 values

Value                 Count  Frequency (%)
21001                 1      < 0.1%
20999                 1      < 0.1%
20996                 1      < 0.1%
20992                 1      < 0.1%
20987                 1      < 0.1%

code
Real number (ℝ≥0)

Distinct: 300
Distinct (%): 7.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 380205.8108
Minimum: 1
Maximum: 603993
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 538
Q1: 2311
Median: 600104
Q3: 600886
95th percentile: 601901
Maximum: 603993
Range: 603992
Interquartile range (IQR): 598575

Descriptive statistics

Standard deviation: 280415.1958
Coefficient of variation (CV): 0.7375352711
Kurtosis: -1.635403866
Mean: 380205.8108
Median Absolute Deviation (MAD): 1529
Skewness: -0.5457610591
Sum: 1451245580
Variance: 7.863268202 × 10^10
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
600015                23     0.6%
600048                23     0.6%
601166                22     0.6%
2179                  22     0.6%
963                   21     0.6%
600383                20     0.5%
600000                20     0.5%
2142                  20     0.5%
600085                20     0.5%
600016                20     0.5%
Other values (290)    3606   94.5%

Minimum 5 values

Value                 Count  Frequency (%)
1                     16     0.4%
2                     15     0.4%
63                    13     0.3%
69                    16     0.4%
100                   12     0.3%

Maximum 5 values

Value                 Count  Frequency (%)
603993                11     0.3%
603986                5      0.1%
603899                7      0.2%
603833                6      0.2%
603799                9      0.2%

lossRate
Real number (ℝ)

Distinct: 3787
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -0.167572457
Minimum: -0.6474201474
Maximum: -0.002478314746
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: -0.6474201474
5th percentile: -0.3639826316
Q1: -0.2071078431
Median: -0.15
Q3: -0.1025316456
95th percentile: -0.04903941719
Maximum: -0.002478314746
Range: 0.6449418327
Interquartile range (IQR): 0.1045761976

Descriptive statistics

Standard deviation: 0.09693163845
Coefficient of variation (CV): -0.5784461252
Kurtosis: 3.456829427
Mean: -0.167572457
Median Absolute Deviation (MAD): 0.05128755365
Skewness: -1.56155528
Sum: -639.6240684
Variance: 0.009395742533
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
-0.1                  4      0.1%
-0.1111111111         3      0.1%
-0.2                  3      0.1%
-0.1264367816         2      0.1%
-0.2076335878         2      0.1%
-0.1779141104         2      0.1%
-0.2262038074         2      0.1%
-0.1363636364         2      0.1%
-0.08135283364        2      0.1%
-0.2307692308         2      0.1%
Other values (3777)   3793   99.4%

Minimum 5 values

Value                 Count  Frequency (%)
-0.6474201474         1      < 0.1%
-0.6473358116         1      < 0.1%
-0.6232077206         1      < 0.1%
-0.6113454407         1      < 0.1%
-0.6104495747         1      < 0.1%

Maximum 5 values

Value                 Count  Frequency (%)
-0.002478314746       1      < 0.1%
-0.002485501243       1      < 0.1%
-0.004375             1      < 0.1%
-0.005015045135       1      < 0.1%
-0.007796994613       1      < 0.1%

startDate
Categorical

HIGH CARDINALITY

Distinct: 1217
Distinct (%): 31.9%
Missing: 0
Missing (%): 0.0%
Memory size: 29.9 KiB

2016-08-15: 30
2010-11-08: 30
2015-01-05: 28
2019-04-08: 28
2020-08-17: 27
Other values (1212): 3674

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 38170
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 497
Unique (%): 13.0%

Sample

1st row: 2011-05-20
2nd row: 2011-07-29
3rd row: 2013-03-06
4th row: 2013-10-30
5th row: 2014-08-04
Common values

Value                 Count  Frequency (%)
2016-08-15            30     0.8%
2010-11-08            30     0.8%
2015-01-05            28     0.7%
2019-04-08            28     0.7%
2020-08-17            27     0.7%
2018-02-05            25     0.7%
2020-03-05            22     0.6%
2015-06-15            19     0.5%
2019-04-19            19     0.5%
2019-04-09            18     0.5%
Other values (1207)   3571   93.6%

Histogram of lengths of the category

Most occurring characters

Value  Count  Frequency (%)
0      9178   24.0%
-      7634   20.0%
1      6892   18.1%
2      6630   17.4%
3      1277   3.3%
5      1181   3.1%
9      1181   3.1%
8      1117   2.9%
7      1079   2.8%
4      1008   2.6%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    30536  80.0%
Dash Punctuation  7634   20.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      9178   30.1%
1      6892   22.6%
2      6630   21.7%
3      1277   4.2%
5      1181   3.9%
9      1181   3.9%
8      1117   3.7%
7      1079   3.5%
4      1008   3.3%
6      993    3.3%

Dash Punctuation
Value  Count  Frequency (%)
-      7634   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  38170  100.0%

Most frequent character per script

Value  Count  Frequency (%)
0      9178   24.0%
-      7634   20.0%
1      6892   18.1%
2      6630   17.4%
3      1277   3.3%
5      1181   3.1%
9      1181   3.1%
8      1117   2.9%
7      1079   2.8%
4      1008   2.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  38170  100.0%

Most frequent character per block

Value  Count  Frequency (%)
0      9178   24.0%
-      7634   20.0%
1      6892   18.1%
2      6630   17.4%
3      1277   3.3%
5      1181   3.1%
9      1181   3.1%
8      1117   2.9%
7      1079   2.8%
4      1008   2.6%

endDate
Categorical

HIGH CARDINALITY

Distinct: 1447
Distinct (%): 37.9%
Missing: 0
Missing (%): 0.0%
Memory size: 29.9 KiB

2020-10-09: 96
2019-05-10: 24
2019-05-13: 23
2019-05-14: 22
2019-05-15: 19
Other values (1442): 3633

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 38170
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 590
Unique (%): 15.5%

Sample

1st row: 2011-06-16
2nd row: 2011-08-10
3rd row: 2013-04-16
4th row: 2013-12-24
5th row: 2014-10-09
Common values

Value                 Count  Frequency (%)
2020-10-09            96     2.5%
2019-05-10            24     0.6%
2019-05-13            23     0.6%
2019-05-14            22     0.6%
2019-05-15            19     0.5%
2015-07-06            18     0.5%
2015-07-08            17     0.4%
2015-07-03            16     0.4%
2019-05-09            16     0.4%
2019-05-08            16     0.4%
Other values (1437)   3550   93.0%

Histogram of lengths of the category

Most occurring characters

Value  Count  Frequency (%)
0      9115   23.9%
-      7634   20.0%
1      6898   18.1%
2      6739   17.7%
9      1376   3.6%
3      1375   3.6%
5      1142   3.0%
7      1122   2.9%
6      1004   2.6%
8      921    2.4%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    30536  80.0%
Dash Punctuation  7634   20.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      9115   29.9%
1      6898   22.6%
2      6739   22.1%
9      1376   4.5%
3      1375   4.5%
5      1142   3.7%
7      1122   3.7%
6      1004   3.3%
8      921    3.0%
4      844    2.8%

Dash Punctuation
Value  Count  Frequency (%)
-      7634   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  38170  100.0%

Most frequent character per script

Value  Count  Frequency (%)
0      9115   23.9%
-      7634   20.0%
1      6898   18.1%
2      6739   17.7%
9      1376   3.6%
3      1375   3.6%
5      1142   3.0%
7      1122   2.9%
6      1004   2.6%
8      921    2.4%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  38170  100.0%

Most frequent character per block

Value  Count  Frequency (%)
0      9115   23.9%
-      7634   20.0%
1      6898   18.1%
2      6739   17.7%
9      1376   3.6%
3      1375   3.6%
5      1142   3.0%
7      1122   2.9%
6      1004   2.6%
8      921    2.4%

minDate
Categorical

HIGH CARDINALITY

Distinct: 1026
Distinct (%): 26.9%
Missing: 0
Missing (%): 0.0%
Memory size: 29.9 KiB

2019-05-09: 84
2019-05-06: 69
2015-07-08: 64
2020-02-03: 60
2018-02-09: 41
Other values (1021): 3499

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 38170
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 424
Unique (%): 11.1%

Sample

1st row: 2011-06-16
2nd row: 2011-08-08
3rd row: 2013-04-15
4th row: 2013-12-24
5th row: 2014-09-22
Common values

Value                 Count  Frequency (%)
2019-05-09            84     2.2%
2019-05-06            69     1.8%
2015-07-08            64     1.7%
2020-02-03            60     1.6%
2018-02-09            41     1.1%
2015-02-06            35     0.9%
2020-09-10            35     0.9%
2015-07-07            34     0.9%
2020-03-23            33     0.9%
2016-01-11            28     0.7%
Other values (1016)   3334   87.3%

Histogram of lengths of the category

Most occurring characters

Value  Count  Frequency (%)
0      9163   24.0%
-      7634   20.0%
2      6772   17.7%
1      6478   17.0%
9      1457   3.8%
3      1446   3.8%
6      1133   3.0%
7      1102   2.9%
5      1100   2.9%
8      1051   2.8%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    30536  80.0%
Dash Punctuation  7634   20.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      9163   30.0%
2      6772   22.2%
1      6478   21.2%
9      1457   4.8%
3      1446   4.7%
6      1133   3.7%
7      1102   3.6%
5      1100   3.6%
8      1051   3.4%
4      834    2.7%

Dash Punctuation
Value  Count  Frequency (%)
-      7634   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  38170  100.0%

Most frequent character per script

Value  Count  Frequency (%)
0      9163   24.0%
-      7634   20.0%
2      6772   17.7%
1      6478   17.0%
9      1457   3.8%
3      1446   3.8%
6      1133   3.0%
7      1102   2.9%
5      1100   2.9%
8      1051   2.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  38170  100.0%

Most frequent character per block

Value  Count  Frequency (%)
0      9163   24.0%
-      7634   20.0%
2      6772   17.7%
1      6478   17.0%
9      1457   3.8%
3      1446   3.8%
6      1133   3.0%
7      1102   2.9%
5      1100   2.9%
8      1051   2.8%

isLastSmallPeriod
Boolean

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB
True: 3817

Value  Count  Frequency (%)
True   3817   100.0%

smallPeriodDays
Real number (ℝ≥0)

Distinct: 69
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.12549122
Minimum: 2
Maximum: 94
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 2
5th percentile: 10
Q1: 18
Median: 24
Q3: 31
95th percentile: 45
Maximum: 94
Range: 92
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 10.68405729
Coefficient of variation (CV): 0.4252277972
Kurtosis: 1.054874715
Mean: 25.12549122
Median Absolute Deviation (MAD): 6
Skewness: 0.678057988
Sum: 95904
Variance: 114.1490801
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
21                    222    5.8%
22                    200    5.2%
23                    172    4.5%
24                    166    4.3%
25                    159    4.2%
20                    154    4.0%
26                    151    4.0%
19                    125    3.3%
18                    124    3.2%
17                    121    3.2%
Other values (59)     2223   58.2%

Minimum 5 values

Value                 Count  Frequency (%)
2                     12     0.3%
3                     15     0.4%
4                     21     0.6%
5                     25     0.7%
6                     21     0.6%

Maximum 5 values

Value                 Count  Frequency (%)
94                    1      < 0.1%
75                    1      < 0.1%
71                    1      < 0.1%
70                    2      0.1%
69                    1      < 0.1%

bigPeriodDays
Real number (ℝ≥0)

Distinct: 225
Distinct (%): 5.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.09693477
Minimum: 11
Maximum: 363
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 11
5th percentile: 13
Q1: 26
Median: 45
Q3: 71
95th percentile: 143
Maximum: 363
Range: 352
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 43.12892079
Coefficient of variation (CV): 0.7688284747
Kurtosis: 5.456888496
Mean: 56.09693477
Median Absolute Deviation (MAD): 22
Skewness: 1.971304334
Sum: 214122
Variance: 1860.103809
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
22                    74     1.9%
21                    74     1.9%
11                    69     1.8%
13                    69     1.8%
18                    67     1.8%
16                    64     1.7%
14                    63     1.7%
28                    63     1.7%
23                    62     1.6%
17                    61     1.6%
Other values (215)    3151   82.6%

Minimum 5 values

Value                 Count  Frequency (%)
11                    69     1.8%
12                    59     1.5%
13                    69     1.8%
14                    63     1.7%
15                    60     1.6%

Maximum 5 values

Value                 Count  Frequency (%)
363                   1      < 0.1%
341                   1      < 0.1%
332                   1      < 0.1%
329                   1      < 0.1%
274                   2      0.1%

riseRange
Real number (ℝ≥0)

Distinct: 3787
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.066637485
Minimum: 0.5049599416
Maximum: 6.843658966
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 0.5049599416
5th percentile: 0.8150045853
Q1: 0.9131282496
Median: 0.9747596154
Q3: 1.092771353
95th percentile: 1.617230818
Maximum: 6.843658966
Range: 6.338699024
Interquartile range (IQR): 0.1796431033

Descriptive statistics

Standard deviation: 0.3599133146
Coefficient of variation (CV): 0.3374279638
Kurtosis: 66.23820412
Mean: 1.066637485
Median Absolute Deviation (MAD): 0.07728857739
Skewness: 6.253659882
Sum: 4071.355279
Variance: 0.129537594
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
1                     14     0.4%
0.9333333333          3      0.1%
1.083333333           2      0.1%
0.9569892473          2      0.1%
1.014084507           2      0.1%
0.9846153846          2      0.1%
0.984939759           2      0.1%
0.9512195122          2      0.1%
1.084337349           2      0.1%
0.9784946237          2      0.1%
Other values (3777)   3784   99.1%

Minimum 5 values

Value                 Count  Frequency (%)
0.5049599416          1      < 0.1%
0.5118483412          1      < 0.1%
0.5651372097          1      < 0.1%
0.5766419727          1      < 0.1%
0.5788732394          1      < 0.1%

Maximum 5 values

Value                 Count  Frequency (%)
6.843658966           1      < 0.1%
6.766723014           1      < 0.1%
5.620170801           1      < 0.1%
5.349867976           1      < 0.1%
5.3290135             1      < 0.1%

s_bias
Real number (ℝ)

UNIQUE

Distinct: 3817
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.09832955228
Minimum: -0.09367243928
Maximum: 1.109560968
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: -0.09367243928
5th percentile: 0.008254124361
Q1: 0.04507181773
Median: 0.07668566002
Q3: 0.1239437576
95th percentile: 0.2545718276
Maximum: 1.109560968
Range: 1.203233407
Interquartile range (IQR): 0.07887193983

Descriptive statistics

Standard deviation: 0.0936454214
Coefficient of variation (CV): 0.952362939
Kurtosis: 20.3508024
Mean: 0.09832955228
Median Absolute Deviation (MAD): 0.03701673905
Skewness: 3.35866059
Sum: 375.323901
Variance: 0.008769464948
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
0.05131589706         1      < 0.1%
0.211536278           1      < 0.1%
0.05402246277         1      < 0.1%
0.09139827053         1      < 0.1%
0.1174833137          1      < 0.1%
0.009975062344        1      < 0.1%
0.04099378882         1      < 0.1%
0.1092264199          1      < 0.1%
0.05283757339         1      < 0.1%
0.08756412524         1      < 0.1%
Other values (3807)   3807   99.7%

Minimum 5 values

Value                 Count  Frequency (%)
-0.09367243928        1      < 0.1%
-0.07208780244        1      < 0.1%
-0.07103699796        1      < 0.1%
-0.06394103337        1      < 0.1%
-0.06229068989        1      < 0.1%

Maximum 5 values

Value                 Count  Frequency (%)
1.109560968           1      < 0.1%
1.018942428           1      < 0.1%
0.939125316           1      < 0.1%
0.9299347915          1      < 0.1%
0.9083101793          1      < 0.1%

m_bias
Real number (ℝ≥0)

UNIQUE

Distinct: 3817
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.08041813909
Minimum: 5.072794603 × 10^-5
Maximum: 0.9735135588
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 5.072794603 × 10^-5
5th percentile: 0.004240978127
Q1: 0.02844626816
Median: 0.06110714306
Q3: 0.1090962833
95th percentile: 0.2195913781
Maximum: 0.9735135588
Range: 0.9734628309
Interquartile range (IQR): 0.08065001519

Descriptive statistics

Standard deviation: 0.07826992518
Coefficient of variation (CV): 0.9732869483
Kurtosis: 14.48346138
Mean: 0.08041813909
Median Absolute Deviation (MAD): 0.03784713042
Skewness: 2.767151613
Sum: 306.9560369
Variance: 0.006126181188
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
0.01285153772         1      < 0.1%
0.1133831378          1      < 0.1%
0.07185963973         1      < 0.1%
0.004714375161        1      < 0.1%
0.0889194102          1      < 0.1%
0.04545300593         1      < 0.1%
0.03488869708         1      < 0.1%
0.02974107219         1      < 0.1%
0.1584808589          1      < 0.1%
0.02441304228         1      < 0.1%
Other values (3807)   3807   99.7%

Minimum 5 values

Value                 Count  Frequency (%)
5.072794603 × 10^-5   1      < 0.1%
6.386593263 × 10^-5   1      < 0.1%
7.363770251 × 10^-5   1      < 0.1%
7.365804254 × 10^-5   1      < 0.1%
9.950496281 × 10^-5   1      < 0.1%

Maximum 5 values

Value                 Count  Frequency (%)
0.9735135588          1      < 0.1%
0.802687355           1      < 0.1%
0.7208058746          1      < 0.1%
0.6659070292          1      < 0.1%
0.6578289065          1      < 0.1%

l_bias
Real number (ℝ≥0)

UNIQUE

Distinct: 3817
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0618161654
Minimum: 7.3067902 × 10^-6
Maximum: 0.7002089755
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 7.3067902 × 10^-6
5th percentile: 0.001496452173
Q1: 0.01451021473
Median: 0.04044242964
Q3: 0.08946215884
95th percentile: 0.187324015
Maximum: 0.7002089755
Range: 0.7002016688
Interquartile range (IQR): 0.07495194412

Descriptive statistics

Standard deviation: 0.06558592217
Coefficient of variation (CV): 1.060983349
Kurtosis: 8.457706688
Mean: 0.0618161654
Median Absolute Deviation (MAD): 0.03120548192
Skewness: 2.183813523
Sum: 235.9523033
Variance: 0.004301513186
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count  Frequency (%)
0.06493837303         1      < 0.1%
0.00988567847         1      < 0.1%
0.1185733335          1      < 0.1%
0.02323192253         1      < 0.1%
0.05128535955         1      < 0.1%
0.1610916781          1      < 0.1%
0.06982048024         1      < 0.1%
0.1284224527          1      < 0.1%
0.1601224088          1      < 0.1%
0.1166249506          1      < 0.1%
Other values (3807)   3807   99.7%

Minimum 5 values

Value                 Count  Frequency (%)
7.3067902 × 10^-6     1      < 0.1%
4.118090214 × 10^-5   1      < 0.1%
4.335746679 × 10^-5   1      < 0.1%
4.692245567 × 10^-5   1      < 0.1%
5.984732272 × 10^-5   1      < 0.1%

Maximum 5 values

Value                 Count  Frequency (%)
0.7002089755          1      < 0.1%
0.6385946269          1      < 0.1%
0.5470917983          1      < 0.1%
0.4623104353          1      < 0.1%
0.4539257134          1      < 0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
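
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs; the function name `pearson_r` and the sample values are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and of the two standard deviations
    # cancel, so plain sums of deviations are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either input is constant (zero variance).
    return cov / math.sqrt(ss_x * ss_y)


# Illustrative values only (hypothetical, not taken from the dataset).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson_r(x, y)
# Shifting and positively rescaling x leaves r unchanged (up to rounding).
r_shifted = pearson_r([10 * a + 3 for a in x], y)
print(r, r_shifted)
```

The second call illustrates the invariance under changes in location and scale mentioned above.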

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one replaces each value by its rank and divides the covariance of the two rank variables by the product of the rank variables' standard deviations.
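
As a minimal sketch of this definition in plain Python: rank both variables, then apply Pearson's formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`), the choice of average ranks for ties, and the sample values are assumptions made for illustration.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Covariance over the product of standard deviations (1/n factors cancel)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y = x**3 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

Because the ranks of `x` and `y` coincide, ρ reaches its maximum here while Pearson's r on the raw values stays below it.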

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
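
The pair-counting definition above can be written directly as a short Python sketch. It treats tied pairs as neither concordant nor discordant (often called tau-a); implementations that correct for ties may return different values on data with ties. The function name `kendall_tau_a` and the sample values are assumptions made for illustration.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: three pairs, two concordant and one discordant.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

This loop is quadratic in the number of observations, which is fine for a sketch but slow for large columns.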

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

#     df_index  code    lossRate   startDate   endDate     minDate     isLastSmallPeriod  smallPeriodDays  bigPeriodDays  riseRange  s_bias     m_bias    l_bias
0     1         1       -0.129226  2011-05-20  2011-06-16  2011-06-16  True               18               27             0.876509   0.015054   0.065634  0.026031
1     4         1       -0.077278  2011-07-29  2011-08-10  2011-08-08  True               9                14             0.934922   -0.001698  0.007307  0.023201
2     12        1       -0.234639  2013-03-06  2013-04-16  2013-04-15  True               28               59             1.064736   0.129536   0.189792  0.151592
3     15        1       -0.187500  2013-10-30  2013-12-24  2013-12-24  True               40               42             0.891089   0.146862   0.087100  0.010167
4     21        1       -0.079597  2014-08-04  2014-10-09  2014-09-22  True               43               70             1.044012   0.084487   0.032902  0.037013
5     28        1       -0.156679  2015-01-05  2015-02-13  2015-02-06  True               30               69             1.278643   0.086212   0.210216  0.084151
6     34        1       -0.203586  2015-06-08  2015-06-26  2015-06-26  True               14               65             1.118075   0.081538   0.052224  0.123846
7     42        1       -0.066116  2016-08-15  2016-10-10  2016-09-26  True               34               64             1.035187   0.059197   0.036812  0.013633
8     45        1       -0.061140  2016-12-09  2017-01-04  2016-12-28  True               18               28             0.952183   0.022462   0.022185  0.008483
9     46        1       -0.050209  2017-02-20  2017-03-31  2017-03-30  True               30               30             0.959205   0.026412   0.000734  0.004940

Last rows

#     df_index  code    lossRate   startDate   endDate     minDate     isLastSmallPeriod  smallPeriodDays  bigPeriodDays  riseRange  s_bias     m_bias    l_bias
3807  20969     603993  -0.461415  2015-06-05  2015-07-03  2015-07-02  True               20               151            1.486559   0.142822   0.092978  0.176045
3808  20972     603993  -0.344371  2015-08-13  2015-08-25  2015-08-25  True               9                14             0.780645   0.061005   0.032866  0.041628
3809  20973     603993  -0.165254  2015-12-21  2016-01-05  2016-01-05  True               11               11             0.834746   -0.051066  0.104093  0.002544
3810  20974     603993  -0.155390  2016-07-07  2016-08-12  2016-08-01  True               27               27             0.867895   0.096491   0.121941  0.003791
3811  20978     603993  -0.234545  2017-02-23  2017-05-03  2017-04-24  True               47               55             1.014613   0.254419   0.073667  0.020045
3812  20987     603993  -0.210773  2017-09-11  2017-11-02  2017-10-18  True               34               88             1.424797   0.226835   0.077082  0.182364
3813  20992     603993  -0.210526  2018-03-19  2018-04-25  2018-04-17  True               26               39             0.914918   0.139531   0.104949  0.046862
3814  20996     603993  -0.231827  2019-04-08  2019-05-08  2019-05-06  True               20               39             0.818557   0.056126   0.093146  0.052772
3815  20999     603993  -0.319188  2020-02-14  2020-03-17  2020-03-16  True               23               43             0.848416   0.248848   0.085317  0.050942
3816  21001     603993  -0.184959  2020-08-06  2020-09-10  2020-09-10  True               26               28             0.856838   0.096257   0.140618  0.012180